Abstract

Smith has proposed an elegant extension of the ML type system for polymorphic
functional languages with overloading. Type inference in his system requires
solving a satisfiability problem that is undecidable if no restrictions are imposed
on overloading. This short note explores the effect of recursion and the structure
of type assumptions in overloadings on the problem's complexity.
1  Introduction

The ML type system [Mil78, DaM82] has been studied extensively and has some
well-known limitations. One practical limitation is that it prohibits overloading by allowing
no more than one assumption per identifier in a type context. In an attempt to overcome
this limitation, Smith gives an elegant type system that merges ML-style polymorphism
and overloading [Smi91, Smi93, Smi94]. The system has the usual set of unquantified
types given by

$$\tau ::= \alpha \mid \tau \to \tau' \mid \chi(\tau_1, \ldots, \tau_n)$$

where $\chi$ is an $n$-ary type constructor, like int or matrix, and $\alpha$ is a type variable.

The set of quantified types, or type schemes, is given by

$$\sigma ::= \forall \alpha_1, \ldots, \alpha_n \ \mathbf{with}\ x_1 : \tau_1, \ldots, x_m : \tau_m \,.\, \tau$$

The set $\{\alpha_1, \ldots, \alpha_n\}$ is the set of quantified variables of $\sigma$, $\{x_1 : \tau_1, \ldots, x_m : \tau_m\}$ the set
of constraints of $\sigma$, and $\tau$ the body of $\sigma$. If there are no constraints, the with portion of
the type scheme is omitted.

The type inference rules are given in Figure 1. We can prove typing judgements of
the form $A \vdash e : \sigma$ where $A$ is a finite set of assumptions of the form $x : \sigma$, called a
type context. A type context $A$ may contain more than one typing for an identifier $x$;
in this case we say that $x$ is overloaded in $A$. We use the notation $A \vdash C$ to represent
$A \vdash x : \tau$, for all $x : \tau$ in $C$, and let $|A|$ denote the number of assumptions in type
context $A$.

1 Appeared in Information Processing Letters, 57(1), pp. 9-14, Jan 1996. This material is based upon
activities supported by the National Science Foundation under Agreement No. CCR-9400592. Any
opinions, findings, and conclusions or recommendations expressed in this publication are those of the
authors and do not necessarily reflect the views of the National Science Foundation.
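(hypoth)   $A \vdash x : \sigma$, if $x : \sigma \in A$

($\to$-intro)   $\dfrac{A \cup \{x : \tau\} \vdash e : \tau'}{A \vdash \lambda x.\, e : \tau \to \tau'}$   ($x$ does not occur in $A$)

($\to$-elim)   $\dfrac{A \vdash e : \tau \to \tau' \qquad A \vdash e' : \tau}{A \vdash e\ e' : \tau'}$

(let)   $\dfrac{A \vdash e : \sigma \qquad A \cup \{x : \sigma\} \vdash e' : \tau}{A \vdash \mathbf{let}\ x = e\ \mathbf{in}\ e' : \tau}$   ($x$ does not occur in $A$)

($\forall$-intro)   $\dfrac{A \cup C \vdash e : \tau \qquad A \vdash C[\bar{\alpha} := \bar{\pi}]}{A \vdash e : \forall \bar{\alpha}\ \mathbf{with}\ C \,.\, \tau}$   ($\bar{\alpha}$ not free in $A$)

($\forall$-elim)   $\dfrac{A \vdash e : \forall \bar{\alpha}\ \mathbf{with}\ C \,.\, \tau \qquad A \vdash C[\bar{\alpha} := \bar{\pi}]}{A \vdash e : \tau[\bar{\alpha} := \bar{\pi}]}$

($\equiv_\alpha$)   $\dfrac{A \vdash e : \sigma \qquad \sigma \equiv_\alpha \sigma'}{A \vdash e : \sigma'}$

Figure 1: Typing Rules with Overloading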
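$+ : \mathit{real} \to \mathit{real} \to \mathit{real}$
$+ : \forall \alpha\ \mathbf{with}\ + : \alpha \to \alpha \to \alpha \,.\, \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha)$
$* : \mathit{int} \to \mathit{int} \to \mathit{int}$
$* : \mathit{real} \to \mathit{real} \to \mathit{real}$
$* : \forall \alpha\ \mathbf{with}\ + : \alpha \to \alpha \to \alpha,\ * : \alpha \to \alpha \to \alpha \,.\, \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha) \to \mathit{matrix}(\alpha)$

Figure 2: A recursive type context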


Observe that when $C$ is empty the ($\forall$-intro) and ($\forall$-elim) rules reduce respectively to
type generalization and specialization in the type system of Damas and Milner [DaM82].
The second hypothesis of the ($\forall$-intro) rule says that a constraint set can be moved into
a type scheme only if the constraint set is satisfiable; the typing $A \vdash e : \forall \bar{\alpha}\ \mathbf{with}\ C \,.\, \tau$
cannot be derived unless we know that there is some way of instantiating the $\bar{\alpha}$ so that
$C$ is satisfied. This leads to the following problem:

Definition 1.1 CS-SAT is the following problem: given a constraint set $C$ and a type
context $A$, is there a substitution $S$ such that $A \vdash CS$?

In  practice,  we  would  expect  a  type  inference  algorithm  to  generate  C  by  specializing 
the  constraints  of  type  schemes  in  A.  So  from  now  on,  we  assume  that  constraints  in  C 
can  be  formed  by  specializing  constraints  in  A. 

CS-SAT  is  undecidable  as  was  shown  by  Smith  [Smi91].  Type  schemes  in  this  system 
permit  very  expressive  type  contexts  to  be  constructed,  some  of  which  may  be  recursive 
in  the  following  sense. 

Definition 1.2 Let $S = \{x, y, \ldots\}$ be the set of identifiers with assumptions in a type
context $A$. Then the dependency relation of $A$ is a binary relation $R$ on $S$ such that $x R y$
iff $x : \sigma \in A$ and $y : \tau$ is a constraint of $\sigma$ ($x$ and $y$ are not necessarily distinct).

Definition 1.3 A type context $A$ with dependency relation $R$ is recursive iff $\exists x.\ x R^+ x$.

For example, the type context in Figure 2 is recursive due to the assumptions for $*$
and $+$. As a result of recursion, we have infinitely many types at which $*$ and $+$ have
instances.
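Definitions 1.2 and 1.3 can be illustrated with a small sketch. The encoding below is an assumption made for illustration only: each assumption is reduced to a pair of an identifier and the list of identifiers constrained in its type scheme, and the names `FIGURE_2`, `dependency_relation`, `transitive_closure`, and `is_recursive` are hypothetical.

```python
# Each assumption is abstracted to (identifier, identifiers constrained in its scheme).
# This mirrors Figure 2; the types themselves are not needed to test for recursion.
FIGURE_2 = [
    ("+", []),           # + : real -> real -> real
    ("+", ["+"]),        # + : forall a with + : a -> a -> a . matrix(a) -> ...
    ("*", []),           # * : int -> int -> int
    ("*", []),           # * : real -> real -> real
    ("*", ["+", "*"]),   # * : forall a with + : ..., * : ... . matrix(a) -> ...
]

def dependency_relation(context):
    """Return R as a set of pairs (x, y): some assumption for x constrains y."""
    return {(x, y) for x, constrained in context for y in constrained}

def transitive_closure(relation):
    """Compute R+ by repeatedly composing the relation with itself."""
    closure = set(relation)
    while True:
        extra = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        if extra <= closure:
            return closure
        closure |= extra

def is_recursive(context):
    """A context is recursive iff x R+ x holds for some identifier x."""
    closure = transitive_closure(dependency_relation(context))
    return any(x == y for (x, y) in closure)
```

Under this encoding, the context of Figure 2 is reported as recursive because both `+` and `*` constrain themselves.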


2  Lower  Bounds 

Although CS-SAT is decidable when recursion is prohibited, it remains hard if the structure
of type assumptions is not restricted, requiring nondeterministic exponential time.

Theorem  2.1  CS-SAT  is  NEXPTIME  complete  for  nonrecursive  type  contexts. 

Proof. Given a nonrecursive type context $A$ and a constraint set $C$, construct a set $E$
of equations as follows. For each constraint $x : \tau \in C$, nondeterministically choose an
assumption $x : \forall \bar{\alpha}\ \mathbf{with}\ C' \,.\, \tau'$ in $A$, add to $E$ the equation $\tau = \tau'[\bar{\alpha} := \bar{\gamma}]$, where $\bar{\gamma}$ is new,
and add the members of $C'[\bar{\alpha} := \bar{\gamma}]$ to $C$. Then $E$ is unifiable iff the original set $C$ is
satisfiable with respect to $A$. The size of $E$ is at most exponential in $|A|$. Whether $E$ is
unifiable can be decided in linear time so CS-SAT is in NEXPTIME. For hardness, let

$$B = \{ c_{p(n)+1} : e \to q_0 \to a_1 \to a_2 \to \cdots \to a_n \to e \}$$

in the PTIME reduction of Theorem 4.2 in [VoS91]; $a_1 a_2 \cdots a_n$ is an input string $x$
to a nondeterministic Turing machine $M$ of exponential time complexity. Then $B$ is
satisfiable under $A_x$ iff $M$ accepts $x$. □

Next  we  consider  the  complexity  of  CS-SAT  with  the  style  of  overloading  permitted 
in  the  lazy  functional  programming  language  Haskell  [Has92].  An  identifier  may  be 
overloaded  in  Haskell,  but  only  through  multiple  instance  declarations,  which  give  rise 
to  what  we  call  a  Haskell  type  context.  A  Haskell  type  context  is  very  structured. 

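Definition 2.1 Suppose $X$ is the set of identifiers of a nonempty type context $H$. Then
$H$ is a Haskell type context if for each $x \in X$, there is a type $\tau_x$ with exactly one free
type variable $\alpha_x$, possibly occurring more than once, such that all assumptions for $x$ in
$H$ are described by the set

$$\left\{ \begin{array}{l} x : \forall \bar{\gamma}_1\ \mathbf{with}\ C_1 \,.\, \tau_x[\alpha_x := \chi_1(\bar{\gamma}_1)] \\ \quad \vdots \\ x : \forall \bar{\gamma}_n\ \mathbf{with}\ C_n \,.\, \tau_x[\alpha_x := \chi_n(\bar{\gamma}_n)] \end{array} \right.$$

where $n \ge 1$, $\chi_i \ne \chi_j$ for $i \ne j$, and

$y : \rho \in C_i$ implies $y \in X$, $\rho = \tau_y[\alpha_y := \gamma]$, and $\gamma \in \bar{\gamma}_i$.

If $x$ is overloaded ($n > 1$) then the type $\tau_x$ used in forming the assumptions for $x$ is
the least common generalization [Rey70] of the types

$$\tau_x[\alpha_x := \chi_1(\bar{\gamma}_1)],\ \ldots,\ \tau_x[\alpha_x := \chi_n(\bar{\gamma}_n)]$$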

If $x$ has only one assumption ($n = 1$), then it is, technically speaking, not overloaded;
nevertheless it may appear in a constraint. The body of a type scheme in an assumption
must be formed by specializing a type with one level of type structure, or in other words,
by a single type constructor $\chi_i$ parameterized perhaps by one or more type variables.
Further, in a constraint $y : \rho$, it must be possible to form $\rho$ by merely renaming the free
type variable of $\tau_y$, the type used in forming all assumptions for $y$.

The recursive type context of Figure 2 is an example of a Haskell type context. Consider
the assumptions for $+$. The bodies of its type schemes are formed by instantiating
$\alpha$ of the type $\alpha \to \alpha \to \alpha$ with real and matrix($\alpha$), so $\tau_+ = \alpha \to \alpha \to \alpha$.
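The least common generalization of the bodies for $+$ can be computed by anti-unification. The following is a minimal sketch under stated assumptions, not the construction of [Rey70] verbatim: types are encoded as nested tuples `(constructor, arg, ...)`, type variables as strings, and the names `lcg`, `arrow`, and `matrix` are hypothetical helpers.

```python
def lcg(types):
    """Least common generalization of a nonempty list of types (anti-unification)."""
    table = {}  # maps a tuple of disagreeing subterms to the variable that replaces it

    def go(terms):
        first = terms[0]
        same_head = isinstance(first, tuple) and all(
            isinstance(t, tuple) and t[0] == first[0] and len(t) == len(first)
            for t in terms
        )
        if same_head:
            # All terms share the constructor: generalize argument by argument.
            args = [go([t[i] for t in terms]) for i in range(1, len(first))]
            return (first[0], *args)
        # The terms disagree here: reuse one variable per distinct disagreement.
        key = tuple(terms)
        if key not in table:
            table[key] = f"v{len(table)}"
        return table[key]

    return go(list(types))

def arrow(a, b):
    return ("->", a, b)

def matrix(t):
    return ("matrix", t)

REAL = ("real",)
PLUS_REAL = arrow(REAL, arrow(REAL, REAL))
PLUS_MATRIX = arrow(matrix("a"), arrow(matrix("a"), matrix("a")))
TAU_PLUS = lcg([PLUS_REAL, PLUS_MATRIX])
```

Here `TAU_PLUS` has a single variable in all three positions, which corresponds to $\alpha \to \alpha \to \alpha$ up to renaming.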

Regular tree languages recognized by DR tree automata characterize the types of
identifiers in Haskell type contexts. Formally, a $k$-ary, $\Sigma$-valued tree is a mapping $t :
\mathrm{dom}(t) \to \Sigma$, where $\mathrm{dom}(t) \subseteq \{1, \ldots, k\}^*$ is a nonempty set and closed under prefixes. We
can assume for our purposes that $\Sigma$ is a ranked alphabet of type constructors $\chi_1, \ldots, \chi_n$.
We let $T_\Sigma$ denote the set of all finite $\Sigma$-trees. A deterministic root-to-frontier (DR)
$\Sigma$-tree automaton $M$ is a pair $(\mathcal{A}, S)$ such that

1. $\mathcal{A}$ is a finite DR $\Sigma$-algebra $(A, \Sigma)$, and
2. $S \in A$ is the initial state.

A DR $\Sigma$-algebra is a pair $(A, \Sigma)$ where $A$ is a nonempty set of states and, for every
$\sigma \in \Sigma_m$ with $m > 0$ there is a partial transition function $\sigma^{\mathcal{A}} : A \to A^m$. If $\sigma \in \Sigma_0$,
then $\sigma^{\mathcal{A}} \subseteq A$ [GeS84]. A run of $M$ on a tree $t$ is a mapping $g : \mathrm{dom}(t) \to A$ such that
$g(\epsilon) = S$, and if $t(w) = \sigma$, $\sigma \in \Sigma_m$ for $m > 0$, and $g(w) = a$, then

$$\sigma^{\mathcal{A}}(a) = (g(w1), g(w2), \ldots, g(wm)).$$

A run $g$ of $M$ on a tree $t$ is accepting if $g(w) \in t(w)^{\mathcal{A}}$ for every node $w$ in the frontier
of $t$. The set of all trees on which $M$ has accepting runs is the language of $M$, written
$T(M)$.
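The definition of an accepting run can be made concrete with a short sketch. The encoding is assumed for illustration: a tree is a nested tuple `(symbol, child, ...)`, `transitions` maps `(symbol, state)` to a tuple of successor states for symbols of positive rank, and `finals` maps each nullary symbol to its set of accepting states. The example automaton and all names are hypothetical.

```python
def accepts(transitions, finals, state, tree):
    """Check whether the unique run starting in `state` accepts `tree`."""
    symbol, children = tree[0], tree[1:]
    if not children:
        # Frontier node: the state must belong to the set assigned to the symbol.
        return state in finals.get(symbol, set())
    successors = transitions.get((symbol, state))
    if successors is None or len(successors) != len(children):
        return False  # the partial transition function is undefined here
    return all(
        accepts(transitions, finals, s, c) for s, c in zip(successors, children)
    )

# Hypothetical automaton over a binary symbol f and nullary symbols a and b.
TRANSITIONS = {("f", "S"): ("P", "Q"), ("f", "P"): ("P", "P")}
FINALS = {"a": {"P"}, "b": {"Q"}}
```

Because the automaton is deterministic and works from the root toward the frontier, the state at each node is fixed by the path leading to it, so a single recursive descent suffices.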

Deciding  whether  the  intersection  of  a  sequence  of  DR  tree  automata  is  nonempty  is 
EXPTIME  complete.  Hardness  can  be  proved  by  a  log-space  generic  reduction  from 
polynomial-space  bounded  alternating  Turing  machines  (ATMs)  [Sei94].  The  lower 
bound  on  CS-SAT  for  Haskell  type  contexts  follows  directly  from  this  result  while  the 
upper  bound  follows  from  the  following  lemma: 

Lemma 2.2 Suppose $H$ is a Haskell type context. If $x$ is an identifier of $H$ then a DR
tree automaton $M_x$ can be constructed such that

$$T(M_x) = \{\pi \mid H \vdash x : \tau_x[\alpha_x := \pi]\}.$$

Further, such automata can be constructed for all identifiers of $H$ in time exponential
(space polynomial) in $|H|$.

Proof. Given a Haskell type context $H$, let $X$ be the set of identifiers of $H$. Let $A = \mathcal{P}(X)$
and $\Sigma$ be the ranked alphabet of type constructors in $H$. Define $M_x = (\mathcal{A}, \{x\})$, where
$\mathcal{A} = (A, \Sigma)$, and $\mathcal{A}$ is initially constructed so that $T(\mathcal{A}, \{\}) = T_\Sigma$. Then extend $\mathcal{A}$ with
the following transitions. For every $r \subseteq X$, let $r \in \chi^{\mathcal{A}}$ iff all identifiers in $r$ have an
instance assumption in $H$ at nullary type constructor $\chi$. Let $\chi^{\mathcal{A}}(r) = (r_1, \ldots, r_k)$ iff all
identifiers in $r$ have an instance at type constructor $\chi(\alpha_1, \ldots, \alpha_k)$, for $k > 0$. Set $r_j$, for
$1 \le j \le k$, contains an identifier, say $z$, iff there is an identifier in $r$ whose instance
assumption at $\chi$ has a constraint on $z$ involving $\alpha_j$. There are $2^{|X|}$ subsets to consider,
each of which can be stored in at most $|X|$ space. □
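The transitions described in this proof can be computed on demand rather than tabulated. The sketch below assumes a hypothetical encoding of a Haskell type context: `INSTANCES` maps `(identifier, constructor)` to a tuple with one set of constrained identifiers per argument position, and `RANK` gives each constructor's arity. The data mirrors Figure 2, and only ground types $\pi$ with correct arities are handled.

```python
# Figure 2 as instance data: one entry per assumption.
INSTANCES = {
    ("+", "real"): (),
    ("+", "matrix"): ({"+"},),
    ("*", "int"): (),
    ("*", "real"): (),
    ("*", "matrix"): ({"+", "*"},),
}
RANK = {"int": 0, "real": 0, "matrix": 1}

def step(instances, rank, state, constructor):
    """Successor states of `state` at `constructor`, or None if undefined."""
    successors = [set() for _ in range(rank[constructor])]
    for identifier in state:
        constraints = instances.get((identifier, constructor))
        if constraints is None:
            return None  # some identifier has no instance at this constructor
        for position, constrained in enumerate(constraints):
            successors[position] |= constrained
    return tuple(frozenset(s) for s in successors)

def has_instance(instances, rank, identifier, pi):
    """Decide membership of the ground type `pi` in T(M_x), starting from {x}."""
    def run(state, tree):
        successors = step(instances, rank, state, tree[0])
        if successors is None:
            return False
        # Assumes the number of children of each node equals the constructor's rank.
        return all(run(s, c) for s, c in zip(successors, tree[1:]))
    return run(frozenset({identifier}), pi)
```

With this data, `*` has an instance at matrix(real) but not at matrix(int), since the matrix assumption for `*` constrains `+` and `+` has no assumption at int.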

The  following  theorem  was  proved  by  Seidl  [Sei94]  in  the  framework  of  Nipkow  and 
Prehofer’s  type  system  [NiPr93].  A  proof  is  given  here  for  Smith’s  system. 

Theorem  2.3  CS-SAT  is  EXPTIME  complete  for  Haskell  type  contexts. 

Proof. Let $H$ be a Haskell type context and $C$ a constraint set. First construct a DR
tree automaton $(\mathcal{A}, \{x\})$ for each identifier $x$ of $H$. By Lemma 2.2, this can be done
in exponential time. Suppose that for each constraint $x : \rho \in C$, $\rho = \tau_x[\alpha_x := \pi]$, for
some type $\pi$. Let $g$ be a run of $(\mathcal{A}, \{x\})$ on $\pi$. If there is a node $w \in \mathrm{dom}(\pi)$ such
that $\pi(w) = \chi_m$, $m > 0$, and $g(w)$ is undefined, then reject. If $\pi$ has type variables,
then associate a DR tree automaton with every occurrence of a variable in $\pi$ as follows.
For every node $w$ such that $\pi(w) = \beta$ and $g(w) = a$, assign to this occurrence of type
variable $\beta$, tree automaton $(\mathcal{A}, a)$.
Now for every type variable $\alpha$ in $C$, with occurrences $\alpha_1, \ldots, \alpha_m$, decide whether the
languages of their assigned tree automata

$$(\mathcal{A}, a_1), \ldots, (\mathcal{A}, a_m)$$

intersect by checking the emptiness of automaton

$$(\mathcal{A}, a_1 \cup \cdots \cup a_m).$$

This can be done in PTIME [Don70]. Then accept iff this intersection is nonempty for
all type variables in $C$.

Hardness follows from a log-space reduction from the DR tree automaton intersection
problem. Given a sequence of tree automata $M_1, M_2, \ldots, M_k$, where $M_i = (\mathcal{A}_i, S_i)$ and
$\mathcal{A}_i$ is the $\Sigma$-algebra $(A_i, \Sigma)$, we construct a constraint set $C$ and a Haskell type context
$H$ such that $C$ is satisfiable with respect to $H$ iff $\bigcap_{i=1}^{k} T(M_i)$ is nonempty. Assume the
sets of states $A_1, \ldots, A_k$ are disjoint. For all $1 \le i \le k$ add to $H$ a set of assumptions
for $M_i$ as follows. If $\sigma^{\mathcal{A}_i}(a) = (a_1, \ldots, a_m)$, for $\sigma \in \Sigma_m$, then add to $H$,

$$a : \forall \alpha_1 \cdots \alpha_m \ \mathbf{with}\ a_1 : \alpha_1, \ldots, a_m : \alpha_m \,.\, \sigma(\alpha_1, \ldots, \alpha_m)$$

and for $\sigma \in \Sigma_0$, add $a : \sigma$ for all $a \in \sigma^{\mathcal{A}_i}$. Then with $C = \{S_1 : \alpha, \ldots, S_k : \alpha\}$, we have
$\bigcap_{i=1}^{k} T(M_i)$ is nonempty iff $C$ is satisfiable with respect to $H$. □

Now we consider the complexity of CS-SAT when all assumptions for an overloaded
identifier, say $x$, in a Haskell type context are formed by specializing $\tau_x$ with unary
or nullary type constructors only. An example of this kind of overloading is given in
Figure 2. We call such contexts unary Haskell type contexts.

Definition 2.2 A Haskell type context $H$ is a unary Haskell type context if all assumptions
for every identifier $x$ of $H$ have the form $x : \tau_x[\alpha_x := \chi_0]$, for some nullary type
constructor $\chi_0$, or $x : \forall \beta\ \mathbf{with}\ C \,.\, \tau_x[\alpha_x := \chi(\beta)]$, for some unary type constructor $\chi$.

Theorem  2.4  CS-SAT  is  PSPACE  complete  for  unary  Haskell  type  contexts. 

Proof. Let $H$ be a unary Haskell type context and $C$ a constraint set. Suppose that for
each constraint $x : \rho \in C$, $\rho = \tau_x[\alpha_x := \pi]$, for some type $\pi$. Let $g$ be a run of $(\mathcal{A}, \{x\})$ on
$\pi$, where $(\mathcal{A}, \{x\})$ is constructed "on demand" and in polynomial space by Lemma 2.2.
As in Theorem 2.3, assign tree automaton $(\mathcal{A}, a)$ to a type variable $\beta$ in $\pi$ if $\pi(w) = \beta$ and
$g(w) = a$ for some node $w$. If a variable $\alpha$ in $C$ has occurrences $\alpha_1, \ldots, \alpha_m$, then check
whether the languages of their assigned tree automata $(\mathcal{A}, a_1), \ldots, (\mathcal{A}, a_m)$ intersect by
checking the emptiness of $(\mathcal{A}, a_1 \cup \cdots \cup a_m)$. This automaton represents a DFA since
$H$ is a unary Haskell type context. Thus emptiness can be checked by searching an
on-demand construction of it nondeterministically in $\log 2^{|H|}$ space, or DSPACE($|H|^2$).

Hardness follows from a log-space reduction from the DFA intersection problem
[Koz77]. Let $M_1, M_2, \ldots, M_k$ be a sequence of DFAs, where $M_i$ is $(Q_i, \Sigma, \delta_i, q_{0i}, F_i)$,
and the sets of states $Q_1, \ldots, Q_k$ are disjoint. Create a unary Haskell type context $H$
where for all $1 \le i \le k$, assumptions are added to $H$ for $M_i$ as follows. For all $q, q' \in Q_i$
and $a \in \Sigma$ such that $\delta_i(q, a) = q'$, add to $H$,

$$q : \forall \alpha\ \mathbf{with}\ q' : \alpha \,.\, a(\alpha)$$

and for all $q \in F_i$ add $q : e$ where $e$ is a nullary type constructor. Then with $C = \{q_{01} :
\alpha, \ldots, q_{0k} : \alpha\}$, $\bigcap_{i=1}^{k} L(M_i)$ is nonempty iff $C$ is satisfiable with respect to $H$. □
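The encoding of DFAs as a unary Haskell type context can be sketched as follows. Everything here is an illustrative assumption: assumptions are represented as triples, states are tagged with the index of their DFA to keep them disjoint, the alphabet is assumed not to contain the symbol `"e"`, and the search keeps a set of visited states, so it is a plain breadth-first search rather than the space-bounded procedure of the proof.

```python
from collections import deque

def encode(dfas):
    """Translate DFAs (delta, start, finals) into assumptions and a constraint set.

    (q, a, q2) stands for  q : forall alpha with q2 : alpha . a(alpha);
    (q, "e", None) stands for  q : e.
    """
    context, starts = set(), []
    for i, (delta, start, finals) in enumerate(dfas):
        for (q, a), q2 in delta.items():
            context.add(((i, q), a, (i, q2)))  # tag states with the DFA index
        for q in finals:
            context.add(((i, q), "e", None))
        starts.append((i, start))
    return context, starts

def satisfying_type(context, starts):
    """Return a type pi satisfying {q : pi | q in starts}, or None if none exists."""
    table = {(q, a): q2 for (q, a, q2) in context}
    symbols = sorted({a for (_, a, _) in context if a != "e"})
    first = frozenset(starts)
    queue, seen = deque([(first, [])]), {first}
    while queue:
        state, path = queue.popleft()
        if all((q, "e") in table for q in state):
            return "".join(f"{a}(" for a in path) + "e" + ")" * len(path)
        for a in symbols:
            if all((q, a) in table for q in state):
                successor = frozenset(table[(q, a)] for q in state)
                if successor not in seen:
                    seen.add(successor)
                    queue.append((successor, path + [a]))
    return None

# Hypothetical DFAs over {a, b}: strings ending in b, strings containing an a,
# and strings consisting only of b's (a partial DFA).
ENDS_IN_B = ({(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}, 0, {1})
HAS_A = ({(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1}, 0, {1})
ONLY_B = ({(0, "b"): 0}, 0, {0})
```

A string $a_1 a_2 \cdots a_n$ accepted by every DFA corresponds to the witness type $a_1(a_2(\cdots a_n(e) \cdots))$, which is what `satisfying_type` returns when the intersection is nonempty.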

And  finally  we  consider  unary  Haskell  type  contexts  without  recursion. 

Theorem  2.5  CS-SAT  is  NP  hard  for  nonrecursive,  unary  Haskell  type  contexts. 

Proof. The proof is by a PTIME reduction from 3CNF-SAT. Suppose $E$ is a 3CNF
formula with clauses $d_1, \ldots, d_n$ and distinct variables $x_1, \ldots, x_m$. The satisfying truth
assignments for each clause $d_i$ are recognizable as unary trees by a DR tree automaton
$M_i = (\mathcal{A}_i, S_i)$ that can be constructed in $O(m)$ time. That is, $B_1(B_2(\cdots B_m(e) \cdots)) \in
T(M_i)$ iff the assignment of truth values $B_1, \ldots, B_m$ to $x_1, \ldots, x_m$ respectively satisfies
$d_i$. Assume the sets of states for the $n$ automata are disjoint. Then for all $1 \le i \le n$,
add to a type context $H$, the assumption

$$a : \forall \alpha\ \mathbf{with}\ b : \alpha \,.\, B(\alpha)$$

if $B^{\mathcal{A}_i}(a) = b$ and $a : e$ if $a \in e^{\mathcal{A}_i}$. Then $\{S_1 : \alpha, \ldots, S_n : \alpha\}$ is satisfiable under $H$ iff $E$
is satisfiable, and $H$ is nonrecursive and unary. □
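One possible shape for the clause automata is sketched below. The state layout is an assumption, not taken from the proof: a state `(i, j, sat)` records that clause `i` has read the truth values of $x_1, \ldots, x_{j-1}$ and whether the clause is already satisfied. Literals are written as nonzero integers, and the final check enumerates all assignments by brute force purely to illustrate the correspondence.

```python
from itertools import product

def clause_automaton(i, clause, m):
    """Return (delta, start, accepting) for clause i over variables x_1..x_m.

    `clause` lists literals as integers: j means x_j and -j means not x_j.
    There are 2m + 2 states, so the construction is linear in m.
    """
    delta = {}
    for j in range(1, m + 1):
        for sat in (False, True):
            for value in ("T", "F"):
                literal = j if value == "T" else -j
                # Reading value for x_j satisfies the clause if that literal occurs.
                delta[((i, j, sat), value)] = (i, j + 1, sat or literal in clause)
    return delta, (i, 1, False), {(i, m + 1, True)}

def run(automaton, assignment):
    """Run one clause automaton on the unary tree B_1(B_2(...B_m(e)...))."""
    delta, state, accepting = automaton
    for value in assignment:
        state = delta[(state, value)]
    return state in accepting

def satisfying_assignment(clauses, m):
    """Return an assignment accepted by every clause automaton, or None."""
    automata = [clause_automaton(i, c, m) for i, c in enumerate(clauses)]
    for assignment in product("TF", repeat=m):
        if all(run(a, assignment) for a in automata):
            return assignment
    return None
```

Each transition only moves from position `j` to position `j + 1`, so no state can reach itself and the resulting type context has no recursion.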

3  Summary 

We have observed that it is necessary to simultaneously restrict both recursion in
overloadings and the structure of type assumptions if there is to be any hope of solving
CS-SAT efficiently. Surprisingly, even for the simple kind of overloading allowed in
Haskell the problem is hard.